Overview

Dataset statistics

Number of variables: 45
Number of observations: 2028
Missing cells: 144
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 713.1 KiB
Average record size in memory: 360.1 B
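
The derived figures in this table can be rechecked from the raw counts. The sketch below is not part of the report; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B. The variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 45
n_observations = 2028
missing_cells = 144
total_size_kib = 713.1

# Assumption: the percentage is missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size is total memory divided by row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 0.2% and 360.1 B.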

Variable types

Numeric: 6
Categorical: 36
Boolean: 3

Alerts

city has a high cardinality: 128 distinct values (High cardinality)
delta_probability_fatality_orbital is highly correlated with delta_probability_fatality_suborbital (High correlation)
delta_probability_fatality_suborbital is highly correlated with delta_probability_fatality_orbital and 1 other field (High correlation)
delta_probability_fatality_moon_trip is highly correlated with delta_probability_fatality_suborbital (High correlation)
delta_price_dollars_suborbital is highly correlated with delta_price_dollars_moon_trip (High correlation)
delta_price_dollars_moon_trip is highly correlated with delta_price_dollars_suborbital (High correlation)
delta_probability_fatality_orbital is highly correlated with delta_probability_fatality_suborbital (High correlation)
delta_probability_fatality_suborbital is highly correlated with delta_probability_fatality_orbital and 1 other field (High correlation)
delta_probability_fatality_moon_trip is highly correlated with delta_probability_fatality_suborbital (High correlation)
delta_price_dollars_suborbital is highly correlated with delta_price_dollars_moon_trip (High correlation)
delta_price_dollars_moon_trip is highly correlated with delta_price_dollars_suborbital (High correlation)
annual_income is highly correlated with household_annual_income (High correlation)
state is highly correlated with region (High correlation)
availability_orbital is highly correlated with delta_probability_fatality_orbital (High correlation)
average_probability_fatality is highly correlated with delta_probability_fatality_suborbital and 3 other fields (High correlation)
number_passengers_moon_trip is highly correlated with delta_probability_fatality_suborbital and 1 other field (High correlation)
region is highly correlated with state (High correlation)
delta_probability_fatality_suborbital is highly correlated with average_probability_fatality and 6 other fields (High correlation)
household_annual_income is highly correlated with annual_income (High correlation)
availability_suborbital is highly correlated with average_probability_fatality and 3 other fields (High correlation)
takeoff_location_orbital is highly correlated with delta_probability_fatality_orbital (High correlation)
number_passengers_orbital is highly correlated with delta_probability_fatality_suborbital and 1 other field (High correlation)
price_attribute_orbital is highly correlated with delta_probability_fatality_suborbital and 1 other field (High correlation)
delta_probability_fatality_moon_trip is highly correlated with average_probability_fatality and 5 other fields (High correlation)
delta_probability_fatality_orbital is highly correlated with availability_orbital and 6 other fields (High correlation)
df_index is highly correlated with annual_income and 3 other fields (High correlation)
gender is highly correlated with household_type and 2 other fields (High correlation)
annual_income is highly correlated with df_index and 9 other fields (High correlation)
household_annual_income is highly correlated with df_index and 8 other fields (High correlation)
number_vehicles is highly correlated with household_type and 1 other field (High correlation)
level_education is highly correlated with annual_income and 3 other fields (High correlation)
work_type is highly correlated with annual_income and 3 other fields (High correlation)
children_home is highly correlated with household_type and 1 other field (High correlation)
household_type is highly correlated with gender and 8 other fields (High correlation)
status_in_household is highly correlated with gender and 2 other fields (High correlation)
type_residence is highly correlated with state (High correlation)
housing_tenure_type is highly correlated with household_type and 1 other field (High correlation)
race is highly correlated with state (High correlation)
risk_activities_sports is highly correlated with state (High correlation)
price_attribute_suborbital is highly correlated with price_attribute_moon_trip and 1 other field (High correlation)
availability_suborbital is highly correlated with average_probability_fatality and 1 other field (High correlation)
training_suborbital is highly correlated with takeoff_location_orbital and 1 other field (High correlation)
number_passengers_suborbital is highly correlated with number_passengers_moon_trip (High correlation)
takeoff_location_suborbital is highly correlated with takeoff_location_moon_trip and 1 other field (High correlation)
price_attribute_orbital is highly correlated with price_attribute_moon_trip and 4 other fields (High correlation)
availability_orbital is highly correlated with takeoff_location_moon_trip and 1 other field (High correlation)
takeoff_location_orbital is highly correlated with training_suborbital and 1 other field (High correlation)
price_attribute_moon_trip is highly correlated with price_attribute_suborbital and 4 other fields (High correlation)
number_passengers_moon_trip is highly correlated with number_passengers_suborbital (High correlation)
takeoff_location_moon_trip is highly correlated with takeoff_location_suborbital and 2 other fields (High correlation)
age is highly correlated with df_index and 5 other fields (High correlation)
generation_age is highly correlated with age and 1 other field (High correlation)
state is highly correlated with df_index and 16 other fields (High correlation)
region is highly correlated with household_type and 1 other field (High correlation)
average_probability_fatality is highly correlated with availability_suborbital and 8 other fields (High correlation)
delta_probability_fatality_orbital is highly correlated with price_attribute_orbital and 4 other fields (High correlation)
delta_probability_fatality_suborbital is highly correlated with price_attribute_orbital and 3 other fields (High correlation)
delta_probability_fatality_moon_trip is highly correlated with availability_suborbital and 4 other fields (High correlation)
average_price_dollars is highly correlated with annual_income and 3 other fields (High correlation)
delta_price_dollars_orbital is highly correlated with annual_income and 5 other fields (High correlation)
delta_price_dollars_suborbital is highly correlated with annual_income and 7 other fields (High correlation)
delta_price_dollars_moon_trip is highly correlated with annual_income and 5 other fields (High correlation)
city has 48 (2.4%) missing values (Missing)
state has 48 (2.4%) missing values (Missing)
region has 48 (2.4%) missing values (Missing)
price_attribute_suborbital is uniformly distributed (Uniform)
availability_suborbital is uniformly distributed (Uniform)
availability_orbital is uniformly distributed (Uniform)
number_passengers_orbital is uniformly distributed (Uniform)
takeoff_location_orbital is uniformly distributed (Uniform)
price_attribute_moon_trip is uniformly distributed (Uniform)
availability_moon_trip is uniformly distributed (Uniform)
number_passengers_moon_trip is uniformly distributed (Uniform)
df_index has unique values (Unique)
delta_price_dollars_orbital has 169 (8.3%) zeros (Zeros)
delta_price_dollars_suborbital has 169 (8.3%) zeros (Zeros)
delta_price_dollars_moon_trip has 169 (8.3%) zeros (Zeros)

Reproduction

Analysis started: 2023-04-30 09:57:55.255881
Analysis finished: 2023-04-30 09:58:13.733147
Duration: 18.48 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
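
The reported duration follows from the two timestamps. A minimal check using only the Python standard library (the variable names are illustrative and not taken from the report):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-04-30 09:57:55.255881")
finished = datetime.fromisoformat("2023-04-30 09:58:13.733147")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

Rounded to two decimals, this gives the 18.48 seconds shown in the report.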

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 2028
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1080.884615
Minimum: 0
Maximum: 2159
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 16.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 101.35
Q1: 530.75
Median: 1085.5
Q3: 1628.25
95-th percentile: 2045.65
Maximum: 2159
Range: 2159
Interquartile range (IQR): 1097.5

Descriptive statistics

Standard deviation: 626.6864769
Coefficient of variation (CV): 0.5797903569
Kurtosis: -1.201594269
Mean: 1080.884615
Median Absolute Deviation (MAD): 549
Skewness: -0.01984256506
Sum: 2192034
Variance: 392735.9403
Monotonicity: Strictly increasing
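
Several of the df_index statistics are determined by others in the same tables. The sketch below recomputes them from the reported values, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, sum = mean × count). It is a consistency check on the printed figures, not the report's own computation.

```python
import math

# Values copied from the df_index tables.
n = 2028
mean = 1080.884615
std = 626.6864769
q1, q3 = 530.75, 1628.25
minimum, maximum = 0, 2159

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance from standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
total = mean * n                   # sum of all values

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.2f}")
print(f"IQR: {iqr}, Range: {value_range}, Sum: {round(total)}")
```

The results match the reported figures up to the rounding of the inputs.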
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    1      < 0.1%
1431                 1      < 0.1%
1444                 1      < 0.1%
1443                 1      < 0.1%
1442                 1      < 0.1%
1441                 1      < 0.1%
1440                 1      < 0.1%
1439                 1      < 0.1%
1438                 1      < 0.1%
1437                 1      < 0.1%
Other values (2018)  2018   99.5%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
2159   1      < 0.1%
2158   1      < 0.1%
2157   1      < 0.1%
2156   1      < 0.1%
2155   1      < 0.1%
2154   1      < 0.1%
2153   1      < 0.1%
2152   1      < 0.1%
2151   1      < 0.1%
2150   1      < 0.1%

choice
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.0 KiB
suborbital: 570
moon_trip: 545
not_travel: 459
orbital: 454

Length

Max length: 10
Median length: 10
Mean length: 9.059664694
Min length: 7

Characters and Unicode

Total characters: 18373
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
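
As an illustration of how such character properties can be tallied, the sketch below rebuilds the character totals for `choice` from its four category counts, using the standard-library `unicodedata.category` function for the general category (`Ll` is Lowercase Letter, `Pc` is Connector Punctuation). This is an independent check, not the profiler's implementation; script and block lookups are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Category counts copied from the "Common Values" table for `choice`.
value_counts = {
    "suborbital": 570,
    "moon_trip": 545,
    "not_travel": 459,
    "orbital": 454,
}

char_counts = Counter()       # occurrences of each character
category_counts = Counter()   # occurrences per Unicode general category
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count
        category_counts[unicodedata.category(ch)] += count

total_chars = sum(char_counts.values())
n_rows = sum(value_counts.values())
mean_length = total_chars / n_rows

print(total_chars, len(char_counts), dict(category_counts))
print(f"mean length: {mean_length:.6f}")
```

The totals reproduce the reported 18373 characters, 15 distinct characters, 17369 lowercase letters and 1004 connector-punctuation characters.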

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: moon_trip
2nd row: suborbital
3rd row: moon_trip
4th row: moon_trip
5th row: suborbital

Common Values

Value       Count  Frequency (%)
suborbital  570    28.1%
moon_trip   545    26.9%
not_travel  459    22.6%
orbital     454    22.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count  Frequency (%)
suborbital  570    28.1%
moon_trip   545    26.9%
not_travel  459    22.6%
orbital     454    22.4%

Most occurring characters

Value             Count  Frequency (%)
o                 2573   14.0%
t                 2487   13.5%
r                 2028   11.0%
b                 1594   8.7%
i                 1569   8.5%
a                 1483   8.1%
l                 1483   8.1%
n                 1004   5.5%
_                 1004   5.5%
s                 570    3.1%
Other values (5)  2578   14.0%

Most occurring categories

Value                  Count  Frequency (%)
Lowercase Letter       17369  94.5%
Connector Punctuation  1004   5.5%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
o                 2573   14.8%
t                 2487   14.3%
r                 2028   11.7%
b                 1594   9.2%
i                 1569   9.0%
a                 1483   8.5%
l                 1483   8.5%
n                 1004   5.8%
s                 570    3.3%
u                 570    3.3%
Other values (4)  2008   11.6%

Connector Punctuation
Value  Count  Frequency (%)
_      1004   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   17369  94.5%
Common  1004   5.5%

Most frequent character per script

Latin
Value             Count  Frequency (%)
o                 2573   14.8%
t                 2487   14.3%
r                 2028   11.7%
b                 1594   9.2%
i                 1569   9.0%
a                 1483   8.5%
l                 1483   8.5%
n                 1004   5.8%
s                 570    3.3%
u                 570    3.3%
Other values (4)  2008   11.6%

Common
Value  Count  Frequency (%)
_      1004   100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  18373  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
o                 2573   14.0%
t                 2487   13.5%
r                 2028   11.0%
b                 1594   8.7%
i                 1569   8.5%
a                 1483   8.1%
l                 1483   8.1%
n                 1004   5.5%
_                 1004   5.5%
s                 570    3.1%
Other values (5)  2578   14.0%

gender
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
male
1200 
female
828 

Length

Max length6
Median length4
Mean length4.816568047
Min length4

Characters and Unicode

Total characters9768
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmale
2nd rowmale
3rd rowmale
4th rowmale
5th rowmale

Common Values

ValueCountFrequency (%)
male1200
59.2%
female828
40.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
male1200
59.2%
female828
40.8%

Most occurring characters

ValueCountFrequency (%)
e2856
29.2%
m2028
20.8%
a2028
20.8%
l2028
20.8%
f828
 
8.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter9768
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2856
29.2%
m2028
20.8%
a2028
20.8%
l2028
20.8%
f828
 
8.5%

Most occurring scripts

ValueCountFrequency (%)
Latin9768
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2856
29.2%
m2028
20.8%
a2028
20.8%
l2028
20.8%
f828
 
8.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII9768
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2856
29.2%
m2028
20.8%
a2028
20.8%
l2028
20.8%
f828
 
8.5%

annual_income
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct10
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
35k_50k
432 
50k_75k
384 
15k_25k
324 
25k_35k
264 
75k_100k
228 
Other values (5)
396 

Length

Max length14
Median length7
Mean length7.899408284
Min length7

Characters and Unicode

Total characters16020
Distinct characters18
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row100k_150k
2nd row100k_150k
3rd row100k_150k
4th row100k_150k
5th row100k_150k

Common Values

ValueCountFrequency (%)
35k_50k432
21.3%
50k_75k384
18.9%
15k_25k324
16.0%
25k_35k264
13.0%
75k_100k228
11.2%
less_than_10k216
10.7%
100k_150k72
 
3.6%
10k_15k60
 
3.0%
150k_200k36
 
1.8%
more_than_200k12
 
0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
35k_50k432
21.3%
50k_75k384
18.9%
15k_25k324
16.0%
25k_35k264
13.0%
75k_100k228
11.2%
less_than_10k216
10.7%
100k_150k72
 
3.6%
10k_15k60
 
3.0%
150k_200k36
 
1.8%
more_than_200k12
 
0.6%

Most occurring characters

ValueCountFrequency (%)
k3828
23.9%
53204
20.0%
_2256
14.1%
01896
11.8%
11068
 
6.7%
3696
 
4.3%
2636
 
4.0%
7612
 
3.8%
s432
 
2.7%
h228
 
1.4%
Other values (8)1164
 
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number8112
50.6%
Lowercase Letter5652
35.3%
Connector Punctuation2256
 
14.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
k3828
67.7%
s432
 
7.6%
h228
 
4.0%
n228
 
4.0%
a228
 
4.0%
e228
 
4.0%
t228
 
4.0%
l216
 
3.8%
m12
 
0.2%
o12
 
0.2%
Decimal Number
ValueCountFrequency (%)
53204
39.5%
01896
23.4%
11068
 
13.2%
3696
 
8.6%
2636
 
7.8%
7612
 
7.5%
Connector Punctuation
ValueCountFrequency (%)
_2256
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common10368
64.7%
Latin5652
35.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
k3828
67.7%
s432
 
7.6%
h228
 
4.0%
n228
 
4.0%
a228
 
4.0%
e228
 
4.0%
t228
 
4.0%
l216
 
3.8%
m12
 
0.2%
o12
 
0.2%
Common
ValueCountFrequency (%)
53204
30.9%
_2256
21.8%
01896
18.3%
11068
 
10.3%
3696
 
6.7%
2636
 
6.1%
7612
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII16020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
k3828
23.9%
53204
20.0%
_2256
14.1%
01896
11.8%
11068
 
6.7%
3696
 
4.3%
2636
 
4.0%
7612
 
3.8%
s432
 
2.7%
h228
 
1.4%
Other values (8)1164
 
7.3%

household_annual_income
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct10
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
50k_75k
564 
75k_100k
480 
35k_50k
276 
100k_150k
192 
25k_35k
156 
Other values (5)
360 

Length

Max length14
Median length7
Mean length7.846153846
Min length7

Characters and Unicode

Total characters15912
Distinct characters18
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row150k_200k
2nd row150k_200k
3rd row150k_200k
4th row150k_200k
5th row150k_200k

Common Values

ValueCountFrequency (%)
50k_75k564
27.8%
75k_100k480
23.7%
35k_50k276
13.6%
100k_150k192
 
9.5%
25k_35k156
 
7.7%
15k_25k120
 
5.9%
150k_200k84
 
4.1%
less_than_10k72
 
3.6%
10k_15k48
 
2.4%
more_than_200k36
 
1.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
50k_75k564
27.8%
75k_100k480
23.7%
35k_50k276
13.6%
100k_150k192
 
9.5%
25k_35k156
 
7.7%
15k_25k120
 
5.9%
150k_200k84
 
4.1%
less_than_10k72
 
3.6%
10k_15k48
 
2.4%
more_than_200k36
 
1.8%

Most occurring characters

ValueCountFrequency (%)
k3948
24.8%
53036
19.1%
02820
17.7%
_2136
13.4%
11236
 
7.8%
71044
 
6.6%
3432
 
2.7%
2396
 
2.5%
s144
 
0.9%
h108
 
0.7%
Other values (8)612
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number8964
56.3%
Lowercase Letter4812
30.2%
Connector Punctuation2136
 
13.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
k3948
82.0%
s144
 
3.0%
h108
 
2.2%
n108
 
2.2%
a108
 
2.2%
e108
 
2.2%
t108
 
2.2%
l72
 
1.5%
m36
 
0.7%
o36
 
0.7%
Decimal Number
ValueCountFrequency (%)
53036
33.9%
02820
31.5%
11236
13.8%
71044
 
11.6%
3432
 
4.8%
2396
 
4.4%
Connector Punctuation
ValueCountFrequency (%)
_2136
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common11100
69.8%
Latin4812
30.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
k3948
82.0%
s144
 
3.0%
h108
 
2.2%
n108
 
2.2%
a108
 
2.2%
e108
 
2.2%
t108
 
2.2%
l72
 
1.5%
m36
 
0.7%
o36
 
0.7%
Common
ValueCountFrequency (%)
53036
27.4%
02820
25.4%
_2136
19.2%
11236
11.1%
71044
 
9.4%
3432
 
3.9%
2396
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII15912
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
k3948
24.8%
53036
19.1%
02820
17.7%
_2136
13.4%
11236
 
7.8%
71044
 
6.6%
3432
 
2.7%
2396
 
2.5%
s144
 
0.9%
h108
 
0.7%
Other values (8)612
 
3.8%

number_vehicles
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
2_cars
1068 
1_car
732 
3_cars
132 
4_or_more_cars
 
96

Length

Max length14
Median length6
Mean length6.017751479
Min length5

Characters and Unicode

Total characters12204
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1_car
2nd row1_car
3rd row1_car
4th row1_car
5th row1_car

Common Values

ValueCountFrequency (%)
2_cars1068
52.7%
1_car732
36.1%
3_cars132
 
6.5%
4_or_more_cars96
 
4.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2_cars1068
52.7%
1_car732
36.1%
3_cars132
 
6.5%
4_or_more_cars96
 
4.7%

Most occurring characters

ValueCountFrequency (%)
_2220
18.2%
r2220
18.2%
c2028
16.6%
a2028
16.6%
s1296
10.6%
21068
8.8%
1732
 
6.0%
o192
 
1.6%
3132
 
1.1%
496
 
0.8%
Other values (2)192
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7956
65.2%
Connector Punctuation2220
 
18.2%
Decimal Number2028
 
16.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r2220
27.9%
c2028
25.5%
a2028
25.5%
s1296
16.3%
o192
 
2.4%
m96
 
1.2%
e96
 
1.2%
Decimal Number
ValueCountFrequency (%)
21068
52.7%
1732
36.1%
3132
 
6.5%
496
 
4.7%
Connector Punctuation
ValueCountFrequency (%)
_2220
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin7956
65.2%
Common4248
34.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
r2220
27.9%
c2028
25.5%
a2028
25.5%
s1296
16.3%
o192
 
2.4%
m96
 
1.2%
e96
 
1.2%
Common
ValueCountFrequency (%)
_2220
52.3%
21068
25.1%
1732
 
17.2%
3132
 
3.1%
496
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII12204
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_2220
18.2%
r2220
18.2%
c2028
16.6%
a2028
16.6%
s1296
10.6%
21068
8.8%
1732
 
6.0%
o192
 
1.6%
3132
 
1.1%
496
 
0.8%
Other values (2)192
 
1.6%

level_education
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
bachelor_degree
1272 
grad_prof_degree
300 
some_college
252 
high_school_graduate
132 
associate_degree
 
72

Length

Max length20
Median length15
Mean length15.13609467
Min length12

Characters and Unicode

Total characters30696
Distinct characters18
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowgrad_prof_degree
2nd rowgrad_prof_degree
3rd rowgrad_prof_degree
4th rowgrad_prof_degree
5th rowgrad_prof_degree

Common Values

ValueCountFrequency (%)
bachelor_degree1272
62.7%
grad_prof_degree300
 
14.8%
some_college252
 
12.4%
high_school_graduate132
 
6.5%
associate_degree72
 
3.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
bachelor_degree1272
62.7%
grad_prof_degree300
 
14.8%
some_college252
 
12.4%
high_school_graduate132
 
6.5%
associate_degree72
 
3.6%

Most occurring characters

ValueCountFrequency (%)
e7164
23.3%
r3648
11.9%
g2460
 
8.0%
_2460
 
8.0%
o2412
 
7.9%
d2076
 
6.8%
a1980
 
6.5%
l1908
 
6.2%
c1728
 
5.6%
h1668
 
5.4%
Other values (8)3192
10.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter28236
92.0%
Connector Punctuation2460
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e7164
25.4%
r3648
12.9%
g2460
 
8.7%
o2412
 
8.5%
d2076
 
7.4%
a1980
 
7.0%
l1908
 
6.8%
c1728
 
6.1%
h1668
 
5.9%
b1272
 
4.5%
Other values (7)1920
 
6.8%
Connector Punctuation
ValueCountFrequency (%)
_2460
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin28236
92.0%
Common2460
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e7164
25.4%
r3648
12.9%
g2460
 
8.7%
o2412
 
8.5%
d2076
 
7.4%
a1980
 
7.0%
l1908
 
6.8%
c1728
 
6.1%
h1668
 
5.9%
b1272
 
4.5%
Other values (7)1920
 
6.8%
Common
ValueCountFrequency (%)
_2460
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII30696
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e7164
23.3%
r3648
11.9%
g2460
 
8.0%
_2460
 
8.0%
o2412
 
7.9%
d2076
 
6.8%
a1980
 
6.5%
l1908
 
6.2%
c1728
 
5.6%
h1668
 
5.4%
Other values (8)3192
10.4%

work_type
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
private
1488 
self_employed
384 
government
 
108
unpaid_work
 
48

Length

Max length13
Median length7
Mean length8.390532544
Min length7

Characters and Unicode

Total characters17016
Distinct characters20
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowprivate
2nd rowprivate
3rd rowprivate
4th rowprivate
5th rowprivate

Common Values

ValueCountFrequency (%)
private1488
73.4%
self_employed384
 
18.9%
government108
 
5.3%
unpaid_work48
 
2.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
private1488
73.4%
self_employed384
 
18.9%
government108
 
5.3%
unpaid_work48
 
2.4%

Most occurring characters

ValueCountFrequency (%)
e2856
16.8%
p1920
11.3%
r1644
9.7%
v1596
9.4%
t1596
9.4%
i1536
9.0%
a1536
9.0%
l768
 
4.5%
o540
 
3.2%
m492
 
2.9%
Other values (10)2532
14.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16584
97.5%
Connector Punctuation432
 
2.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2856
17.2%
p1920
11.6%
r1644
9.9%
v1596
9.6%
t1596
9.6%
i1536
9.3%
a1536
9.3%
l768
 
4.6%
o540
 
3.3%
m492
 
3.0%
Other values (9)2100
12.7%
Connector Punctuation
ValueCountFrequency (%)
_432
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16584
97.5%
Common432
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2856
17.2%
p1920
11.6%
r1644
9.9%
v1596
9.6%
t1596
9.6%
i1536
9.3%
a1536
9.3%
l768
 
4.6%
o540
 
3.3%
m492
 
3.0%
Other values (9)2100
12.7%
Common
ValueCountFrequency (%)
_432
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII17016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2856
16.8%
p1920
11.3%
r1644
9.7%
v1596
9.4%
t1596
9.4%
i1536
9.0%
a1536
9.0%
l768
 
4.5%
o540
 
3.2%
m492
 
2.9%
Other values (10)2532
14.9%

children_home
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
0_children
804 
2_children
600 
1_child
528 
4_children
 
36
3_children
 
36

Length

Max length18
Median length10
Mean length9.313609467
Min length7

Characters and Unicode

Total characters18888
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1_child
2nd row1_child
3rd row1_child
4th row1_child
5th row1_child

Common Values

ValueCountFrequency (%)
0_children804
39.6%
2_children600
29.6%
1_child528
26.0%
4_children36
 
1.8%
3_children36
 
1.8%
5_children_or_more24
 
1.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0_children804
39.6%
2_children600
29.6%
1_child528
26.0%
4_children36
 
1.8%
3_children36
 
1.8%
5_children_or_more24
 
1.2%

Most occurring characters

ValueCountFrequency (%)
_2076
11.0%
c2028
10.7%
h2028
10.7%
i2028
10.7%
l2028
10.7%
d2028
10.7%
r1548
8.2%
e1524
8.1%
n1500
7.9%
0804
 
4.3%
Other values (7)1296
6.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter14784
78.3%
Connector Punctuation2076
 
11.0%
Decimal Number2028
 
10.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c2028
13.7%
h2028
13.7%
i2028
13.7%
l2028
13.7%
d2028
13.7%
r1548
10.5%
e1524
10.3%
n1500
10.1%
o48
 
0.3%
m24
 
0.2%
Decimal Number
ValueCountFrequency (%)
0804
39.6%
2600
29.6%
1528
26.0%
436
 
1.8%
336
 
1.8%
524
 
1.2%
Connector Punctuation
ValueCountFrequency (%)
_2076
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14784
78.3%
Common4104
 
21.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
c2028
13.7%
h2028
13.7%
i2028
13.7%
l2028
13.7%
d2028
13.7%
r1548
10.5%
e1524
10.3%
n1500
10.1%
o48
 
0.3%
m24
 
0.2%
Common
ValueCountFrequency (%)
_2076
50.6%
0804
 
19.6%
2600
 
14.6%
1528
 
12.9%
436
 
0.9%
336
 
0.9%
524
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII18888
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_2076
11.0%
c2028
10.7%
h2028
10.7%
i2028
10.7%
l2028
10.7%
d2028
10.7%
r1548
8.2%
e1524
8.1%
n1500
7.9%
0804
 
4.3%
Other values (7)1296
6.9%

household_type
Categorical

HIGH CORRELATION

Distinct8
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
couple_with_children
1008 
couple_no_children
312 
male_no_children
156 
male_with_children
132 
alone
120 
Other values (3)
300 

Length

Max length20
Median length20
Mean length17.71597633
Min length5

Characters and Unicode

Total characters35928
Distinct characters17
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowcouple_with_children
2nd rowcouple_with_children
3rd rowcouple_with_children
4th rowcouple_with_children
5th rowcouple_with_children

Common Values

ValueCountFrequency (%)
couple_with_children1008
49.7%
couple_no_children312
 
15.4%
male_no_children156
 
7.7%
male_with_children132
 
6.5%
alone120
 
5.9%
female_no_children120
 
5.9%
female_with_children108
 
5.3%
other72
 
3.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
couple_with_children1008
49.7%
couple_no_children312
 
15.4%
male_no_children156
 
7.7%
male_with_children132
 
6.5%
alone120
 
5.9%
female_no_children120
 
5.9%
female_with_children108
 
5.3%
other72
 
3.6%

Most occurring characters

ValueCountFrequency (%)
e4092
11.4%
l3792
10.6%
_3672
10.2%
c3156
8.8%
h3156
8.8%
i3084
8.6%
n2544
 
7.1%
o2100
 
5.8%
r1908
 
5.3%
d1836
 
5.1%
Other values (7)6588
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter32256
89.8%
Connector Punctuation3672
 
10.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e4092
12.7%
l3792
11.8%
c3156
9.8%
h3156
9.8%
i3084
9.6%
n2544
7.9%
o2100
 
6.5%
r1908
 
5.9%
d1836
 
5.7%
p1320
 
4.1%
Other values (6)5268
16.3%
Connector Punctuation
ValueCountFrequency (%)
_3672
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin32256
89.8%
Common3672
 
10.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e4092
12.7%
l3792
11.8%
c3156
9.8%
h3156
9.8%
i3084
9.6%
n2544
7.9%
o2100
 
6.5%
r1908
 
5.9%
d1836
 
5.7%
p1320
 
4.1%
Other values (6)5268
16.3%
Common
ValueCountFrequency (%)
_3672
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII35928
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e4092
11.4%
l3792
10.6%
_3672
10.2%
c3156
8.8%
h3156
8.8%
i3084
8.6%
n2544
 
7.1%
o2100
 
5.8%
r1908
 
5.3%
d1836
 
5.1%
Other values (7)6588
18.3%

status_in_household
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
head
1212 
spouse
624 
child
 
96
other
 
96

Length

Max length6
Median length4
Mean length4.710059172
Min length4

Characters and Unicode

Total characters9552
Distinct characters13
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowhead
2nd rowhead
3rd rowhead
4th rowhead
5th rowhead

Common Values

ValueCountFrequency (%)
head1212
59.8%
spouse624
30.8%
child96
 
4.7%
other96
 
4.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
head1212
59.8%
spouse624
30.8%
child96
 
4.7%
other96
 
4.7%

Most occurring characters

ValueCountFrequency (%)
e1932
20.2%
h1404
14.7%
d1308
13.7%
s1248
13.1%
a1212
12.7%
o720
 
7.5%
p624
 
6.5%
u624
 
6.5%
c96
 
1.0%
i96
 
1.0%
Other values (3)288
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter9552
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1932
20.2%
h1404
14.7%
d1308
13.7%
s1248
13.1%
a1212
12.7%
o720
 
7.5%
p624
 
6.5%
u624
 
6.5%
c96
 
1.0%
i96
 
1.0%
Other values (3)288
 
3.0%

Most occurring scripts

ValueCountFrequency (%)
Latin9552
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1932
20.2%
h1404
14.7%
d1308
13.7%
s1248
13.1%
a1212
12.7%
o720
 
7.5%
p624
 
6.5%
u624
 
6.5%
c96
 
1.0%
i96
 
1.0%
Other values (3)288
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII9552
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e1932
20.2%
h1404
14.7%
d1308
13.7%
s1248
13.1%
a1212
12.7%
o720
 
7.5%
p624
 
6.5%
u624
 
6.5%
c96
 
1.0%
i96
 
1.0%
Other values (3)288
 
3.0%

type_residence
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
house
1572 
apartment
444 
other
 
12

Length

Max length9
Median length5
Mean length5.875739645
Min length5

Characters and Unicode

Total characters11916
Distinct characters11
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowhouse
2nd rowhouse
3rd rowhouse
4th rowhouse
5th rowhouse

Common Values

Value      Count  Frequency (%)
house      1572   77.5%
apartment  444    21.9%
other      12     0.6%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
e2028
17.0%
h1584
13.3%
o1584
13.3%
u1572
13.2%
s1572
13.2%
t900
7.6%
a888
7.5%
r456
 
3.8%
p444
 
3.7%
m444
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11916
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2028
17.0%
h1584
13.3%
o1584
13.3%
u1572
13.2%
s1572
13.2%
t900
7.6%
a888
7.5%
r456
 
3.8%
p444
 
3.7%
m444
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
Latin11916
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2028
17.0%
h1584
13.3%
o1584
13.3%
u1572
13.2%
s1572
13.2%
t900
7.6%
a888
7.5%
r456
 
3.8%
p444
 
3.7%
m444
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII11916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2028
17.0%
h1584
13.3%
o1584
13.3%
u1572
13.2%
s1572
13.2%
t900
7.6%
a888
7.5%
r456
 
3.8%
p444
 
3.7%
m444
 
3.7%

housing_tenure_type
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
own
1524 
rent
504 

Length

Max length4
Median length3
Mean length3.24852071
Min length3

Characters and Unicode

Total characters6588
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowown
2nd rowown
3rd rowown
4th rowown
5th rowown

Common Values

Value  Count  Frequency (%)
own    1524   75.1%
rent   504    24.9%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
n2028
30.8%
o1524
23.1%
w1524
23.1%
r504
 
7.7%
e504
 
7.7%
t504
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter6588
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n2028
30.8%
o1524
23.1%
w1524
23.1%
r504
 
7.7%
e504
 
7.7%
t504
 
7.7%

Most occurring scripts

ValueCountFrequency (%)
Latin6588
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n2028
30.8%
o1524
23.1%
w1524
23.1%
r504
 
7.7%
e504
 
7.7%
t504
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII6588
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n2028
30.8%
o1524
23.1%
w1524
23.1%
r504
 
7.7%
e504
 
7.7%
t504
 
7.7%

origin
Categorical

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
non_hispanic
1656 
hispanic
372 

Length

Max length12
Median length12
Mean length11.26627219
Min length8

Characters and Unicode

Total characters22848
Distinct characters9
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rownon_hispanic
2nd rownon_hispanic
3rd rownon_hispanic
4th rownon_hispanic
5th rownon_hispanic

Common Values

Value         Count  Frequency (%)
non_hispanic  1656   81.7%
hispanic      372    18.3%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
n5340
23.4%
i4056
17.8%
h2028
 
8.9%
s2028
 
8.9%
p2028
 
8.9%
a2028
 
8.9%
c2028
 
8.9%
o1656
 
7.2%
_1656
 
7.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter21192
92.8%
Connector Punctuation1656
 
7.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n5340
25.2%
i4056
19.1%
h2028
 
9.6%
s2028
 
9.6%
p2028
 
9.6%
a2028
 
9.6%
c2028
 
9.6%
o1656
 
7.8%
Connector Punctuation
ValueCountFrequency (%)
_1656
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin21192
92.8%
Common1656
 
7.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n5340
25.2%
i4056
19.1%
h2028
 
9.6%
s2028
 
9.6%
p2028
 
9.6%
a2028
 
9.6%
c2028
 
9.6%
o1656
 
7.8%
Common
ValueCountFrequency (%)
_1656
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII22848
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n5340
23.4%
i4056
17.8%
h2028
 
8.9%
s2028
 
8.9%
p2028
 
8.9%
a2028
 
8.9%
c2028
 
8.9%
o1656
 
7.2%
_1656
 
7.2%

race
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
white
1668 
asian
 
144
black
 
108
two_or_more_races
 
96
other_race
 
12

Length

Max length17
Median length5
Mean length5.597633136
Min length5

Characters and Unicode

Total characters11352
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowasian
2nd rowasian
3rd rowasian
4th rowasian
5th rowasian

Common Values

Value              Count  Frequency (%)
white              1668   82.2%
asian              144    7.1%
black              108    5.3%
two_or_more_races  96     4.7%
other_race         12     0.6%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
e1884
16.6%
i1812
16.0%
t1776
15.6%
w1764
15.5%
h1680
14.8%
a504
 
4.4%
r312
 
2.7%
o300
 
2.6%
_300
 
2.6%
s240
 
2.1%
Other values (6)780
6.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11052
97.4%
Connector Punctuation300
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1884
17.0%
i1812
16.4%
t1776
16.1%
w1764
16.0%
h1680
15.2%
a504
 
4.6%
r312
 
2.8%
o300
 
2.7%
s240
 
2.2%
c216
 
2.0%
Other values (5)564
 
5.1%
Connector Punctuation
ValueCountFrequency (%)
_300
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin11052
97.4%
Common300
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1884
17.0%
i1812
16.4%
t1776
16.1%
w1764
16.0%
h1680
15.2%
a504
 
4.6%
r312
 
2.8%
o300
 
2.7%
s240
 
2.2%
c216
 
2.0%
Other values (5)564
 
5.1%
Common
ValueCountFrequency (%)
_300
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII11352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e1884
16.6%
i1812
16.0%
t1776
15.6%
w1764
15.5%
h1680
14.8%
a504
 
4.4%
r312
 
2.7%
o300
 
2.6%
_300
 
2.6%
s240
 
2.1%
Other values (6)780
6.9%

citizenship
Categorical

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
us_citizen
1980 
other
 
48

Length

Max length10
Median length10
Mean length9.881656805
Min length5

Characters and Unicode

Total characters20040
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowus_citizen
2nd rowus_citizen
3rd rowus_citizen
4th rowus_citizen
5th rowus_citizen

Common Values

Value       Count  Frequency (%)
us_citizen  1980   97.6%
other       48     2.4%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
i3960
19.8%
t2028
10.1%
e2028
10.1%
u1980
9.9%
s1980
9.9%
_1980
9.9%
c1980
9.9%
z1980
9.9%
n1980
9.9%
o48
 
0.2%
Other values (2)96
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter18060
90.1%
Connector Punctuation1980
 
9.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i3960
21.9%
t2028
11.2%
e2028
11.2%
u1980
11.0%
s1980
11.0%
c1980
11.0%
z1980
11.0%
n1980
11.0%
o48
 
0.3%
h48
 
0.3%
Connector Punctuation
ValueCountFrequency (%)
_1980
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin18060
90.1%
Common1980
 
9.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i3960
21.9%
t2028
11.2%
e2028
11.2%
u1980
11.0%
s1980
11.0%
c1980
11.0%
z1980
11.0%
n1980
11.0%
o48
 
0.3%
h48
 
0.3%
Common
ValueCountFrequency (%)
_1980
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII20040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i3960
19.8%
t2028
10.1%
e2028
10.1%
u1980
9.9%
s1980
9.9%
_1980
9.9%
c1980
9.9%
z1980
9.9%
n1980
9.9%
o48
 
0.2%
Other values (2)96
 
0.5%

risk_activities_sports
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
rarely
900 
never
720 
often
408 

Length

Max length6
Median length5
Mean length5.443786982
Min length5

Characters and Unicode

Total characters11040
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rownever
2nd rownever
3rd rownever
4th rownever
5th rownever

Common Values

Value   Count  Frequency (%)
rarely  900    44.4%
never   720    35.5%
often   408    20.1%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
e2748
24.9%
r2520
22.8%
n1128
10.2%
a900
 
8.2%
l900
 
8.2%
y900
 
8.2%
v720
 
6.5%
o408
 
3.7%
f408
 
3.7%
t408
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11040
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2748
24.9%
r2520
22.8%
n1128
10.2%
a900
 
8.2%
l900
 
8.2%
y900
 
8.2%
v720
 
6.5%
o408
 
3.7%
f408
 
3.7%
t408
 
3.7%

Most occurring scripts

ValueCountFrequency (%)
Latin11040
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2748
24.9%
r2520
22.8%
n1128
10.2%
a900
 
8.2%
l900
 
8.2%
y900
 
8.2%
v720
 
6.5%
o408
 
3.7%
f408
 
3.7%
t408
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII11040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2748
24.9%
r2520
22.8%
n1128
10.2%
a900
 
8.2%
l900
 
8.2%
y900
 
8.2%
v720
 
6.5%
o408
 
3.7%
f408
 
3.7%
t408
 
3.7%

price_attribute_suborbital
Categorical

HIGH CORRELATION
UNIFORM

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
362_perc_annual_income
676 
50_perc_annual_income
676 
3_perc_annual_income
676 

Length

Max length22
Median length21
Mean length21
Min length20

Characters and Unicode

Total characters42588
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
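
As an illustration of the character properties mentioned above, the per-category totals for this variable can be recomputed with Python's standard `unicodedata` module. This is a sketch under stated assumptions: the helper name `category_totals` is hypothetical, the value counts are copied from this variable's summary, and the report generator itself may compute these figures differently.

```python
import unicodedata
from collections import Counter

# Value counts copied from the price_attribute_suborbital summary in this report.
value_counts = {
    "362_perc_annual_income": 676,
    "50_perc_annual_income": 676,
    "3_perc_annual_income": 676,
}


def category_totals(counts):
    """Count characters per Unicode general category, weighted by row count.

    Hypothetical helper for illustration only.
    """
    totals = Counter()
    for value, n in counts.items():
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Ll".
            totals[unicodedata.category(ch)] += n
    return totals


totals = category_totals(value_counts)
grand_total = sum(totals.values())
# "Ll" = Lowercase Letter, "Pc" = Connector Punctuation, "Nd" = Decimal Number
for code, count in totals.most_common():
    print(code, count, f"{count / grand_total:.1%}")
```

The two-letter codes map to the category names used in the tables of this report; the standard library does not expose the script or block properties, so those tables are not covered by this sketch.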

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row362_perc_annual_income
2nd row362_perc_annual_income
3rd row50_perc_annual_income
4th row3_perc_annual_income
5th row50_perc_annual_income

Common Values

Value                   Count  Frequency (%)
362_perc_annual_income  676    33.3%
50_perc_annual_income   676    33.3%
3_perc_annual_income    676    33.3%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
_6084
14.3%
n6084
14.3%
a4056
9.5%
e4056
9.5%
c4056
9.5%
i2028
 
4.8%
m2028
 
4.8%
p2028
 
4.8%
r2028
 
4.8%
o2028
 
4.8%
Other values (7)8112
19.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter32448
76.2%
Connector Punctuation6084
 
14.3%
Decimal Number4056
 
9.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n6084
18.8%
a4056
12.5%
e4056
12.5%
c4056
12.5%
i2028
 
6.2%
m2028
 
6.2%
p2028
 
6.2%
r2028
 
6.2%
o2028
 
6.2%
u2028
 
6.2%
Decimal Number
Value  Count  Frequency (%)
3      1352   33.3%
6      676    16.7%
2      676    16.7%
5      676    16.7%
0      676    16.7%
Connector Punctuation
ValueCountFrequency (%)
_6084
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin32448
76.2%
Common10140
 
23.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n6084
18.8%
a4056
12.5%
e4056
12.5%
c4056
12.5%
i2028
 
6.2%
m2028
 
6.2%
p2028
 
6.2%
r2028
 
6.2%
o2028
 
6.2%
u2028
 
6.2%
Common
Value  Count  Frequency (%)
_      6084   60.0%
3      1352   13.3%
6      676    6.7%
2      676    6.7%
5      676    6.7%
0      676    6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII42588
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_6084
14.3%
n6084
14.3%
a4056
9.5%
e4056
9.5%
c4056
9.5%
i2028
 
4.8%
m2028
 
4.8%
p2028
 
4.8%
r2028
 
4.8%
o2028
 
4.8%
Other values (7)8112
19.0%

availability_suborbital
Categorical

HIGH CORRELATION
UNIFORM

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
immediate
1014 
in_5_years
1014 

Length

Max length10
Median length9.5
Mean length9.5
Min length9

Characters and Unicode

Total characters19266
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowimmediate
2nd rowimmediate
3rd rowin_5_years
4th rowin_5_years
5th rowin_5_years

Common Values

Value       Count  Frequency (%)
immediate   1014   50.0%
in_5_years  1014   50.0%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

Value             Count  Frequency (%)
i                 3042   15.8%
e                 3042   15.8%
m                 2028   10.5%
a                 2028   10.5%
_                 2028   10.5%
d                 1014   5.3%
t                 1014   5.3%
n                 1014   5.3%
5                 1014   5.3%
y                 1014   5.3%
Other values (2)  2028   10.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16224
84.2%
Connector Punctuation2028
 
10.5%
Decimal Number1014
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i3042
18.8%
e3042
18.8%
m2028
12.5%
a2028
12.5%
d1014
 
6.2%
t1014
 
6.2%
n1014
 
6.2%
y1014
 
6.2%
r1014
 
6.2%
s1014
 
6.2%
Connector Punctuation
ValueCountFrequency (%)
_2028
100.0%
Decimal Number
Value  Count  Frequency (%)
5      1014   100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16224
84.2%
Common3042
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
i3042
18.8%
e3042
18.8%
m2028
12.5%
a2028
12.5%
d1014
 
6.2%
t1014
 
6.2%
n1014
 
6.2%
y1014
 
6.2%
r1014
 
6.2%
s1014
 
6.2%
Common
Value  Count  Frequency (%)
_      2028   66.7%
5      1014   33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII19266
100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
i                 3042   15.8%
e                 3042   15.8%
m                 2028   10.5%
a                 2028   10.5%
_                 2028   10.5%
d                 1014   5.3%
t                 1014   5.3%
n                 1014   5.3%
5                 1014   5.3%
y                 1014   5.3%
Other values (2)  2028   10.5%

training_suborbital
Boolean

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size2.1 KiB
False
1014 
True
1014 
Value  Count  Frequency (%)
False  1014   50.0%
True   1014   50.0%

number_passengers_suborbital
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
one
1021 
more_than_one
1007 

Length

Max length13
Median length3
Mean length7.965483235
Min length3

Characters and Unicode

Total characters16154
Distinct characters9
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowone
2nd rowone
3rd rowone
4th rowmore_than_one
5th rowmore_than_one

Common Values

Value          Count  Frequency (%)
one            1021   50.3%
more_than_one  1007   49.7%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
o3035
18.8%
n3035
18.8%
e3035
18.8%
_2014
12.5%
m1007
 
6.2%
r1007
 
6.2%
t1007
 
6.2%
h1007
 
6.2%
a1007
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter14140
87.5%
Connector Punctuation2014
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o3035
21.5%
n3035
21.5%
e3035
21.5%
m1007
 
7.1%
r1007
 
7.1%
t1007
 
7.1%
h1007
 
7.1%
a1007
 
7.1%
Connector Punctuation
ValueCountFrequency (%)
_2014
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14140
87.5%
Common2014
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o3035
21.5%
n3035
21.5%
e3035
21.5%
m1007
 
7.1%
r1007
 
7.1%
t1007
 
7.1%
h1007
 
7.1%
a1007
 
7.1%
Common
ValueCountFrequency (%)
_2014
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII16154
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o3035
18.8%
n3035
18.8%
e3035
18.8%
_2014
12.5%
m1007
 
6.2%
r1007
 
6.2%
t1007
 
6.2%
h1007
 
6.2%
a1007
 
6.2%

takeoff_location_suborbital
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
usa
1021 
other
1007 

Length

Max length5
Median length3
Mean length3.993096647
Min length3

Characters and Unicode

Total characters8098
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowusa
2nd rowusa
3rd rowusa
4th rowusa
5th rowusa

Common Values

Value  Count  Frequency (%)
usa    1021   50.3%
other  1007   49.7%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8098
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

Most occurring scripts

ValueCountFrequency (%)
Latin8098
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII8098
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

price_attribute_orbital
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
362_perc_annual_income
683 
3_perc_annual_income
683 
50_perc_annual_income
662 

Length

Max length22
Median length21
Mean length21
Min length20

Characters and Unicode

Total characters42588
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row362_perc_annual_income
2nd row362_perc_annual_income
3rd row362_perc_annual_income
4th row50_perc_annual_income
5th row50_perc_annual_income

Common Values

Value                   Count  Frequency (%)
362_perc_annual_income  683    33.7%
3_perc_annual_income    683    33.7%
50_perc_annual_income   662    32.6%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
_6084
14.3%
n6084
14.3%
a4056
9.5%
e4056
9.5%
c4056
9.5%
i2028
 
4.8%
m2028
 
4.8%
p2028
 
4.8%
r2028
 
4.8%
o2028
 
4.8%
Other values (7)8112
19.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter32448
76.2%
Connector Punctuation6084
 
14.3%
Decimal Number4056
 
9.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n6084
18.8%
a4056
12.5%
e4056
12.5%
c4056
12.5%
i2028
 
6.2%
m2028
 
6.2%
p2028
 
6.2%
r2028
 
6.2%
o2028
 
6.2%
u2028
 
6.2%
Decimal Number
Value  Count  Frequency (%)
3      1366   33.7%
6      683    16.8%
2      683    16.8%
5      662    16.3%
0      662    16.3%
Connector Punctuation
ValueCountFrequency (%)
_6084
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin32448
76.2%
Common10140
 
23.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n6084
18.8%
a4056
12.5%
e4056
12.5%
c4056
12.5%
i2028
 
6.2%
m2028
 
6.2%
p2028
 
6.2%
r2028
 
6.2%
o2028
 
6.2%
u2028
 
6.2%
Common
Value  Count  Frequency (%)
_      6084   60.0%
3      1366   13.5%
6      683    6.7%
2      683    6.7%
5      662    6.5%
0      662    6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII42588
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_6084
14.3%
n6084
14.3%
a4056
9.5%
e4056
9.5%
c4056
9.5%
i2028
 
4.8%
m2028
 
4.8%
p2028
 
4.8%
r2028
 
4.8%
o2028
 
4.8%
Other values (7)8112
19.0%

availability_orbital
Categorical

HIGH CORRELATION
UNIFORM

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
immediate
1014 
in_5_years
1014 

Length

Max length10
Median length9.5
Mean length9.5
Min length9

Characters and Unicode

Total characters19266
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowimmediate
2nd rowimmediate
3rd rowin_5_years
4th rowimmediate
5th rowimmediate

Common Values

Value       Count  Frequency (%)
immediate   1014   50.0%
in_5_years  1014   50.0%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

Value             Count  Frequency (%)
i                 3042   15.8%
e                 3042   15.8%
m                 2028   10.5%
a                 2028   10.5%
_                 2028   10.5%
d                 1014   5.3%
t                 1014   5.3%
n                 1014   5.3%
5                 1014   5.3%
y                 1014   5.3%
Other values (2)  2028   10.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16224
84.2%
Connector Punctuation2028
 
10.5%
Decimal Number1014
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i3042
18.8%
e3042
18.8%
m2028
12.5%
a2028
12.5%
d1014
 
6.2%
t1014
 
6.2%
n1014
 
6.2%
y1014
 
6.2%
r1014
 
6.2%
s1014
 
6.2%
Connector Punctuation
ValueCountFrequency (%)
_2028
100.0%
Decimal Number
Value  Count  Frequency (%)
5      1014   100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16224
84.2%
Common3042
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
i3042
18.8%
e3042
18.8%
m2028
12.5%
a2028
12.5%
d1014
 
6.2%
t1014
 
6.2%
n1014
 
6.2%
y1014
 
6.2%
r1014
 
6.2%
s1014
 
6.2%
Common
Value  Count  Frequency (%)
_      2028   66.7%
5      1014   33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII19266
100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
i                 3042   15.8%
e                 3042   15.8%
m                 2028   10.5%
a                 2028   10.5%
_                 2028   10.5%
d                 1014   5.3%
t                 1014   5.3%
n                 1014   5.3%
5                 1014   5.3%
y                 1014   5.3%
Other values (2)  2028   10.5%
Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size2.1 KiB
True
1014 
False
1014 
Value  Count  Frequency (%)
True   1014   50.0%
False  1014   50.0%

number_passengers_orbital
Categorical

HIGH CORRELATION
UNIFORM

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
one
1014 
more_than_one
1014 

Length

Max length13
Median length8
Mean length8
Min length3

Characters and Unicode

Total characters16224
Distinct characters9
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowone
2nd rowmore_than_one
3rd rowone
4th rowone
5th rowmore_than_one

Common Values

Value          Count  Frequency (%)
one            1014   50.0%
more_than_one  1014   50.0%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
o3042
18.8%
n3042
18.8%
e3042
18.8%
_2028
12.5%
m1014
 
6.2%
r1014
 
6.2%
t1014
 
6.2%
h1014
 
6.2%
a1014
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter14196
87.5%
Connector Punctuation2028
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o3042
21.4%
n3042
21.4%
e3042
21.4%
m1014
 
7.1%
r1014
 
7.1%
t1014
 
7.1%
h1014
 
7.1%
a1014
 
7.1%
Connector Punctuation
ValueCountFrequency (%)
_2028
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14196
87.5%
Common2028
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o3042
21.4%
n3042
21.4%
e3042
21.4%
m1014
 
7.1%
r1014
 
7.1%
t1014
 
7.1%
h1014
 
7.1%
a1014
 
7.1%
Common
ValueCountFrequency (%)
_2028
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII16224
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o3042
18.8%
n3042
18.8%
e3042
18.8%
_2028
12.5%
m1014
 
6.2%
r1014
 
6.2%
t1014
 
6.2%
h1014
 
6.2%
a1014
 
6.2%

takeoff_location_orbital
Categorical

HIGH CORRELATION
UNIFORM

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
usa
1014 
other
1014 

Length

Max length5
Median length4
Mean length4
Min length3

Characters and Unicode

Total characters8112
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowusa
2nd rowother
3rd rowother
4th rowother
5th rowother

Common Values

Value  Count  Frequency (%)
usa    1014   50.0%
other  1014   50.0%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
u1014
12.5%
s1014
12.5%
a1014
12.5%
o1014
12.5%
t1014
12.5%
h1014
12.5%
e1014
12.5%
r1014
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8112
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u1014
12.5%
s1014
12.5%
a1014
12.5%
o1014
12.5%
t1014
12.5%
h1014
12.5%
e1014
12.5%
r1014
12.5%

Most occurring scripts

ValueCountFrequency (%)
Latin8112
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u1014
12.5%
s1014
12.5%
a1014
12.5%
o1014
12.5%
t1014
12.5%
h1014
12.5%
e1014
12.5%
r1014
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII8112
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u1014
12.5%
s1014
12.5%
a1014
12.5%
o1014
12.5%
t1014
12.5%
h1014
12.5%
e1014
12.5%
r1014
12.5%

price_attribute_moon_trip
Categorical

HIGH CORRELATION
UNIFORM

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
3_perc_annual_income
676 
362_perc_annual_income
676 
50_perc_annual_income
676 

Length

Max length22
Median length21
Mean length21
Min length20

Characters and Unicode

Total characters42588
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3_perc_annual_income
2nd row362_perc_annual_income
3rd row50_perc_annual_income
4th row50_perc_annual_income
5th row3_perc_annual_income

Common Values

Value                   Count  Frequency (%)
3_perc_annual_income    676    33.3%
362_perc_annual_income  676    33.3%
50_perc_annual_income   676    33.3%

Length

[Figure: histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot]

Most occurring characters

ValueCountFrequency (%)
n6084
14.3%
_6084
14.3%
e4056
9.5%
c4056
9.5%
a4056
9.5%
u2028
 
4.8%
m2028
 
4.8%
p2028
 
4.8%
r2028
 
4.8%
l2028
 
4.8%
Other values (7)8112
19.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter32448
76.2%
Connector Punctuation6084
 
14.3%
Decimal Number4056
 
9.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n6084
18.8%
e4056
12.5%
c4056
12.5%
a4056
12.5%
u2028
 
6.2%
m2028
 
6.2%
p2028
 
6.2%
r2028
 
6.2%
l2028
 
6.2%
i2028
 
6.2%
Decimal Number
ValueCountFrequency (%)
31352
33.3%
6676
16.7%
2676
16.7%
5676
16.7%
0676
16.7%
Connector Punctuation
ValueCountFrequency (%)
_6084
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin32448
76.2%
Common10140
 
23.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n6084
18.8%
e4056
12.5%
c4056
12.5%
a4056
12.5%
u2028
 
6.2%
m2028
 
6.2%
p2028
 
6.2%
r2028
 
6.2%
l2028
 
6.2%
i2028
 
6.2%
Common
ValueCountFrequency (%)
_6084
60.0%
31352
 
13.3%
6676
 
6.7%
2676
 
6.7%
5676
 
6.7%
0676
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII42588
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n6084
14.3%
_6084
14.3%
e4056
9.5%
c4056
9.5%
a4056
9.5%
u2028
 
4.8%
m2028
 
4.8%
p2028
 
4.8%
r2028
 
4.8%
l2028
 
4.8%
Other values (7)8112
19.0%

availability_moon_trip
Categorical

UNIFORM

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
immediate
1014 
in_5_years
1014 

Length

Max length10
Median length9.5
Mean length9.5
Min length9

Characters and Unicode

Total characters19266
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowimmediate
2nd rowin_5_years
3rd rowimmediate
4th rowin_5_years
5th rowimmediate

Common Values

ValueCountFrequency (%)
immediate1014
50.0%
in_5_years1014
50.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
immediate1014
50.0%
in_5_years1014
50.0%

Most occurring characters

ValueCountFrequency (%)
i3042
15.8%
e3042
15.8%
m2028
10.5%
a2028
10.5%
_2028
10.5%
d1014
 
5.3%
t1014
 
5.3%
n1014
 
5.3%
51014
 
5.3%
y1014
 
5.3%
Other values (2)2028
10.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16224
84.2%
Connector Punctuation2028
 
10.5%
Decimal Number1014
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i3042
18.8%
e3042
18.8%
m2028
12.5%
a2028
12.5%
d1014
 
6.2%
t1014
 
6.2%
n1014
 
6.2%
y1014
 
6.2%
r1014
 
6.2%
s1014
 
6.2%
Connector Punctuation
ValueCountFrequency (%)
_2028
100.0%
Decimal Number
ValueCountFrequency (%)
51014
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16224
84.2%
Common3042
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
i3042
18.8%
e3042
18.8%
m2028
12.5%
a2028
12.5%
d1014
 
6.2%
t1014
 
6.2%
n1014
 
6.2%
y1014
 
6.2%
r1014
 
6.2%
s1014
 
6.2%
Common
ValueCountFrequency (%)
_2028
66.7%
51014
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII19266
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i3042
15.8%
e3042
15.8%
m2028
10.5%
a2028
10.5%
_2028
10.5%
d1014
 
5.3%
t1014
 
5.3%
n1014
 
5.3%
51014
 
5.3%
y1014
 
5.3%
Other values (2)2028
10.5%
Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size2.1 KiB
False
1014 
True
1014 
ValueCountFrequency (%)
False1014
50.0%
True1014
50.0%

number_passengers_moon_trip
Categorical

HIGH CORRELATION
UNIFORM

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
more_than_one
1014 
one
1014 

Length

Max length13
Median length8
Mean length8
Min length3

Characters and Unicode

Total characters16224
Distinct characters9
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmore_than_one
2nd rowone
3rd rowmore_than_one
4th rowone
5th rowmore_than_one

Common Values

ValueCountFrequency (%)
more_than_one1014
50.0%
one1014
50.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
more_than_one1014
50.0%
one1014
50.0%

Most occurring characters

ValueCountFrequency (%)
o3042
18.8%
e3042
18.8%
n3042
18.8%
_2028
12.5%
m1014
 
6.2%
r1014
 
6.2%
t1014
 
6.2%
h1014
 
6.2%
a1014
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter14196
87.5%
Connector Punctuation2028
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o3042
21.4%
e3042
21.4%
n3042
21.4%
m1014
 
7.1%
r1014
 
7.1%
t1014
 
7.1%
h1014
 
7.1%
a1014
 
7.1%
Connector Punctuation
ValueCountFrequency (%)
_2028
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14196
87.5%
Common2028
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o3042
21.4%
e3042
21.4%
n3042
21.4%
m1014
 
7.1%
r1014
 
7.1%
t1014
 
7.1%
h1014
 
7.1%
a1014
 
7.1%
Common
ValueCountFrequency (%)
_2028
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII16224
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o3042
18.8%
e3042
18.8%
n3042
18.8%
_2028
12.5%
m1014
 
6.2%
r1014
 
6.2%
t1014
 
6.2%
h1014
 
6.2%
a1014
 
6.2%

takeoff_location_moon_trip
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
usa
1021 
other
1007 

Length

Max length5
Median length3
Mean length3.993096647
Min length3

Characters and Unicode

Total characters8098
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
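
The length statistics for this variable follow directly from the two labels and their counts. A small sketch, with illustrative variable names that are not part of the report:

```python
# Labels and row counts as reported for takeoff_location_moon_trip.
value_counts = {"usa": 1021, "other": 1007}

n_rows = sum(value_counts.values())
total_characters = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_characters / n_rows

print(n_rows, total_characters, mean_length)
```

Because "usa" is slightly more frequent than "other", the mean length sits just below 4 while the median length is 3.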

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowusa
2nd rowusa
3rd rowother
4th rowusa
5th rowusa

Common Values

ValueCountFrequency (%)
usa1021
50.3%
other1007
49.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
usa1021
50.3%
other1007
49.7%

Most occurring characters

ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8098
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

Most occurring scripts

ValueCountFrequency (%)
Latin8098
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII8098
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u1021
12.6%
s1021
12.6%
a1021
12.6%
o1007
12.4%
t1007
12.4%
h1007
12.4%
e1007
12.4%
r1007
12.4%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 45
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.86390533
Minimum: 21
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.0 KiB

Quantile statistics

Minimum: 21
5-th percentile: 26
Q1: 31
Median: 37
Q3: 51
95-th percentile: 64
Maximum: 72
Range: 51
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 12.19182395
Coefficient of variation (CV): 0.2983519038
Kurtosis: -0.8041997522
Mean: 40.86390533
Median Absolute Deviation (MAD): 8
Skewness: 0.5635826803
Sum: 82872
Variance: 148.6405712
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=45)

Common values

Value              Count  Frequency (%)
27                 156    7.7%
32                 120    5.9%
30                 108    5.3%
34                 96     4.7%
37                 96     4.7%
29                 72     3.6%
33                 72     3.6%
42                 72     3.6%
41                 72     3.6%
54                 60     3.0%
Other values (35)  1104   54.4%

Minimum 10 values

Value  Count  Frequency (%)
21     12     0.6%
24     12     0.6%
25     60     3.0%
26     36     1.8%
27     156    7.7%
28     48     2.4%
29     72     3.6%
30     108    5.3%
31     60     3.0%
32     120    5.9%

Maximum 10 values

Value  Count  Frequency (%)
72     12     0.6%
69     12     0.6%
67     12     0.6%
66     12     0.6%
65     36     1.8%
64     48     2.4%
61     12     0.6%
60     36     1.8%
59     12     0.6%
58     48     2.4%
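
Several of the statistics listed for age are derived from others in the same block. The sketch below recomputes them from the reported figures only (the raw data is not available here), using the usual definitions: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared. Variable names are illustrative.

```python
# Figures copied from the age block of the report.
n_rows = 2028
total = 82872
minimum, q1, q3, maximum = 21, 31, 51, 72
std = 12.19182395

mean = total / n_rows            # Sum divided by number of observations
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(mean, value_range, iqr, cv, variance)
```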

generation_age
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
millenials
1092 
gen_x
468 
centennials
276 
baby_boomers
192 

Length

Max length12
Median length10
Mean length9.171597633
Min length5

Characters and Unicode

Total characters18600
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmillenials
2nd rowmillenials
3rd rowmillenials
4th rowmillenials
5th rowmillenials

Common Values

ValueCountFrequency (%)
millenials1092
53.8%
gen_x468
23.1%
centennials276
 
13.6%
baby_boomers192
 
9.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
millenials1092
53.8%
gen_x468
23.1%
centennials276
 
13.6%
baby_boomers192
 
9.5%

Most occurring characters

ValueCountFrequency (%)
l3552
19.1%
i2460
13.2%
n2388
12.8%
e2304
12.4%
a1560
8.4%
s1560
8.4%
m1284
 
6.9%
_660
 
3.5%
b576
 
3.1%
g468
 
2.5%
Other values (6)1788
9.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter17940
96.5%
Connector Punctuation660
 
3.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l3552
19.8%
i2460
13.7%
n2388
13.3%
e2304
12.8%
a1560
8.7%
s1560
8.7%
m1284
 
7.2%
b576
 
3.2%
g468
 
2.6%
x468
 
2.6%
Other values (5)1320
 
7.4%
Connector Punctuation
ValueCountFrequency (%)
_660
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin17940
96.5%
Common660
 
3.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l3552
19.8%
i2460
13.7%
n2388
13.3%
e2304
12.8%
a1560
8.7%
s1560
8.7%
m1284
 
7.2%
b576
 
3.2%
g468
 
2.6%
x468
 
2.6%
Other values (5)1320
 
7.4%
Common
ValueCountFrequency (%)
_660
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII18600
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l3552
19.1%
i2460
13.2%
n2388
12.8%
e2304
12.4%
a1560
8.4%
s1560
8.4%
m1284
 
6.9%
_660
 
3.5%
b576
 
3.1%
g468
 
2.5%
Other values (6)1788
9.6%

city
Categorical

HIGH CARDINALITY
MISSING

Distinct128
Distinct (%)6.5%
Missing48
Missing (%)2.4%
Memory size16.0 KiB
New York
 
96
Los Angeles
 
48
Compton
 
36
Anchorage
 
36
Houston
 
36
Other values (123)
1728 

Length

Max length19
Median length15
Mean length9.060606061
Min length4

Characters and Unicode

Total characters17940
Distinct characters47
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRochester
2nd rowRochester
3rd rowRochester
4th rowRochester
5th rowRochester

Common Values

ValueCountFrequency (%)
New York96
 
4.7%
Los Angeles48
 
2.4%
Compton36
 
1.8%
Anchorage36
 
1.8%
Houston36
 
1.8%
Stamford36
 
1.8%
Anaheim36
 
1.8%
Minneapolis24
 
1.2%
San Jose24
 
1.2%
South San Francisco24
 
1.2%
Other values (118)1584
78.1%
(Missing)48
 
2.4%
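
The percentages reported for city are consistent with two different denominators: Missing (%) is taken over all 2028 rows, while Distinct (%) appears to be taken over the 1980 non-missing rows. The sketch below checks this against the reported figures; the same holds for state and region, which report the same 48 missing cells. The denominator choice is inferred from the numbers, not stated by the report.

```python
n_rows = 2028
n_missing = 48
n_present = n_rows - n_missing

# Distinct counts as reported for the three location variables.
distinct = {"city": 128, "state": 30, "region": 4}

missing_pct = round(100 * n_missing / n_rows, 1)
distinct_pct = {name: round(100 * k / n_present, 1) for name, k in distinct.items()}

print(missing_pct, distinct_pct)
```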

Length

Histogram of lengths of the category
ValueCountFrequency (%)
new108
 
4.1%
san108
 
4.1%
york96
 
3.6%
south48
 
1.8%
los48
 
1.8%
angeles48
 
1.8%
valley48
 
1.8%
francisco48
 
1.8%
park36
 
1.4%
vegas36
 
1.4%
Other values (139)2040
76.6%

Most occurring characters

ValueCountFrequency (%)
e1584
 
8.8%
o1512
 
8.4%
a1488
 
8.3%
n1452
 
8.1%
l1128
 
6.3%
i1068
 
6.0%
r1068
 
6.0%
s900
 
5.0%
t804
 
4.5%
684
 
3.8%
Other values (37)6252
34.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter14592
81.3%
Uppercase Letter2664
 
14.8%
Space Separator684
 
3.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1584
10.9%
o1512
10.4%
a1488
10.2%
n1452
10.0%
l1128
 
7.7%
i1068
 
7.3%
r1068
 
7.3%
s900
 
6.2%
t804
 
5.5%
u432
 
3.0%
Other values (13)3156
21.6%
Uppercase Letter
ValueCountFrequency (%)
S360
13.5%
A276
 
10.4%
M216
 
8.1%
C216
 
8.1%
L216
 
8.1%
B180
 
6.8%
N132
 
5.0%
V120
 
4.5%
H120
 
4.5%
P120
 
4.5%
Other values (13)708
26.6%
Space Separator
ValueCountFrequency (%)
684
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin17256
96.2%
Common684
 
3.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1584
 
9.2%
o1512
 
8.8%
a1488
 
8.6%
n1452
 
8.4%
l1128
 
6.5%
i1068
 
6.2%
r1068
 
6.2%
s900
 
5.2%
t804
 
4.7%
u432
 
2.5%
Other values (36)5820
33.7%
Common
ValueCountFrequency (%)
684
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII17940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e1584
 
8.8%
o1512
 
8.4%
a1488
 
8.3%
n1452
 
8.1%
l1128
 
6.3%
i1068
 
6.0%
r1068
 
6.0%
s900
 
5.0%
t804
 
4.5%
684
 
3.8%
Other values (37)6252
34.8%

state
Categorical

HIGH CORRELATION
MISSING

Distinct30
Distinct (%)1.5%
Missing48
Missing (%)2.4%
Memory size16.0 KiB
CA
504 
TX
216 
NY
180 
FL
180 
OH
 
72
Other values (25)
828 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters3960
Distinct characters23
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMN
2nd rowMN
3rd rowMN
4th rowMN
5th rowMN

Common Values

ValueCountFrequency (%)
CA504
24.9%
TX216
 
10.7%
NY180
 
8.9%
FL180
 
8.9%
OH72
 
3.6%
CT60
 
3.0%
WA60
 
3.0%
PA48
 
2.4%
MD48
 
2.4%
NJ48
 
2.4%
Other values (20)564
27.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca504
25.5%
tx216
 
10.9%
ny180
 
9.1%
fl180
 
9.1%
oh72
 
3.6%
ct60
 
3.0%
wa60
 
3.0%
pa48
 
2.4%
md48
 
2.4%
nj48
 
2.4%
Other values (20)564
28.5%

Most occurring characters

ValueCountFrequency (%)
A792
20.0%
C684
17.3%
N396
10.0%
T276
 
7.0%
L252
 
6.4%
X216
 
5.5%
Y216
 
5.5%
F180
 
4.5%
O120
 
3.0%
W108
 
2.7%
Other values (13)720
18.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter3960
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A792
20.0%
C684
17.3%
N396
10.0%
T276
 
7.0%
L252
 
6.4%
X216
 
5.5%
Y216
 
5.5%
F180
 
4.5%
O120
 
3.0%
W108
 
2.7%
Other values (13)720
18.2%

Most occurring scripts

ValueCountFrequency (%)
Latin3960
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A792
20.0%
C684
17.3%
N396
10.0%
T276
 
7.0%
L252
 
6.4%
X216
 
5.5%
Y216
 
5.5%
F180
 
4.5%
O120
 
3.0%
W108
 
2.7%
Other values (13)720
18.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII3960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A792
20.0%
C684
17.3%
N396
10.0%
T276
 
7.0%
L252
 
6.4%
X216
 
5.5%
Y216
 
5.5%
F180
 
4.5%
O120
 
3.0%
W108
 
2.7%
Other values (13)720
18.2%

region
Categorical

HIGH CORRELATION
MISSING

Distinct4
Distinct (%)0.2%
Missing48
Missing (%)2.4%
Memory size16.0 KiB
west
708 
south
708 
northeast
348 
midwest
216 

Length

Max length9
Median length7
Mean length5.563636364
Min length4

Characters and Unicode

Total characters11016
Distinct characters13
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowmidwest
2nd rowmidwest
3rd rowmidwest
4th rowmidwest
5th rowmidwest

Common Values

ValueCountFrequency (%)
west708
34.9%
south708
34.9%
northeast348
17.2%
midwest216
 
10.7%
(Missing)48
 
2.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
west708
35.8%
south708
35.8%
northeast348
17.6%
midwest216
 
10.9%

Most occurring characters

ValueCountFrequency (%)
t2328
21.1%
s1980
18.0%
e1272
11.5%
o1056
9.6%
h1056
9.6%
w924
 
8.4%
u708
 
6.4%
n348
 
3.2%
r348
 
3.2%
a348
 
3.2%
Other values (3)648
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11016
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t2328
21.1%
s1980
18.0%
e1272
11.5%
o1056
9.6%
h1056
9.6%
w924
 
8.4%
u708
 
6.4%
n348
 
3.2%
r348
 
3.2%
a348
 
3.2%
Other values (3)648
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Latin11016
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t2328
21.1%
s1980
18.0%
e1272
11.5%
o1056
9.6%
h1056
9.6%
w924
 
8.4%
u708
 
6.4%
n348
 
3.2%
r348
 
3.2%
a348
 
3.2%
Other values (3)648
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII11016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t2328
21.1%
s1980
18.0%
e1272
11.5%
o1056
9.6%
h1056
9.6%
w924
 
8.4%
u708
 
6.4%
n348
 
3.2%
r348
 
3.2%
a348
 
3.2%
Other values (3)648
 
5.9%

average_probability_fatality
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
5.166666666666667
852 
2.8333333333333335
838 
0.5
169 
7.5
169 

Length

Max length18
Median length17
Mean length15.07988166
Min length3

Characters and Unicode

Total characters30582
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.8333333333333335
2nd row2.8333333333333335
3rd row2.8333333333333335
4th row5.166666666666667
5th row2.8333333333333335

Common Values

Value               Count  Frequency (%)
5.166666666666667   852    42.0%
2.8333333333333335  838    41.3%
0.5                 169    8.3%
7.5                 169    8.3%
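
All four labels of this variable are numeric strings, so the column was presumably typed as Categorical only because of its low cardinality or string storage. As a hedged sketch, the labels can be parsed as floats and summarised numerically from the reported counts alone; the names below are illustrative, and the weighted mean is computed here rather than taken from the report.

```python
# Labels and row counts as reported for average_probability_fatality.
value_counts = {
    "5.166666666666667": 852,
    "2.8333333333333335": 838,
    "0.5": 169,
    "7.5": 169,
}

n_rows = sum(value_counts.values())
# Parse each label as a float and weight it by its row count.
weighted_mean = sum(float(label) * count for label, count in value_counts.items()) / n_rows

print(n_rows, weighted_mean)
```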

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count  Frequency (%)
5.166666666666667   852    42.0%
2.8333333333333335  838    41.3%
0.5                 169    8.3%
7.5                 169    8.3%

Most occurring characters

ValueCountFrequency (%)
311732
38.4%
611076
36.2%
52028
 
6.6%
.2028
 
6.6%
71021
 
3.3%
1852
 
2.8%
2838
 
2.7%
8838
 
2.7%
0169
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number28554
93.4%
Other Punctuation2028
 
6.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
311732
41.1%
611076
38.8%
52028
 
7.1%
71021
 
3.6%
1852
 
3.0%
2838
 
2.9%
8838
 
2.9%
0169
 
0.6%
Other Punctuation
ValueCountFrequency (%)
.2028
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common30582
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
311732
38.4%
611076
36.2%
52028
 
6.6%
.2028
 
6.6%
71021
 
3.3%
1852
 
2.8%
2838
 
2.7%
8838
 
2.7%
0169
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII30582
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
311732
38.4%
611076
36.2%
52028
 
6.6%
.2028
 
6.6%
71021
 
3.3%
1852
 
2.8%
2838
 
2.7%
8838
 
2.7%
0169
 
0.6%

delta_probability_fatality_orbital
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
-2.333333333333333
683 
2.3333333333333335
669 
0.0
338 
4.666666666666667
169 
-4.666666666666666
169 

Length

Max length18
Median length18
Mean length15.41666667
Min length3

Characters and Unicode

Total characters31265
Distinct characters9
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.3333333333333335
2nd row2.3333333333333335
3rd row2.3333333333333335
4th row-2.333333333333333
5th row2.3333333333333335

Common Values

Value               Count  Frequency (%)
-2.333333333333333  683    33.7%
2.3333333333333335  669    33.0%
0.0                 338    16.7%
4.666666666666667   169    8.3%
-4.666666666666666  169    8.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count  Frequency (%)
2.333333333333333   683    33.7%
2.3333333333333335  669    33.0%
0.0                 338    16.7%
4.666666666666667   169    8.3%
4.666666666666666   169    8.3%

Most occurring characters

ValueCountFrequency (%)
320280
64.9%
64901
 
15.7%
.2028
 
6.5%
21352
 
4.3%
-852
 
2.7%
0676
 
2.2%
5669
 
2.1%
4338
 
1.1%
7169
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number28385
90.8%
Other Punctuation2028
 
6.5%
Dash Punctuation852
 
2.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
320280
71.4%
64901
 
17.3%
21352
 
4.8%
0676
 
2.4%
5669
 
2.4%
4338
 
1.2%
7169
 
0.6%
Other Punctuation
ValueCountFrequency (%)
.2028
100.0%
Dash Punctuation
ValueCountFrequency (%)
-852
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common31265
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
320280
64.9%
64901
 
15.7%
.2028
 
6.5%
21352
 
4.3%
-852
 
2.7%
0676
 
2.2%
5669
 
2.1%
4338
 
1.1%
7169
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII31265
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
320280
64.9%
64901
 
15.7%
.2028
 
6.5%
21352
 
4.3%
-852
 
2.7%
0676
 
2.2%
5669
 
2.1%
4338
 
1.1%
7169
 
0.5%

delta_probability_fatality_suborbital
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
4.666666666666667
433 
-4.666666666666666
426 
-2.333333333333333
419 
2.3333333333333335
412 
0.0
338 

Length

Max length18
Median length18
Mean length15.28648915
Min length3

Characters and Unicode

Total characters31001
Distinct characters9
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row-4.666666666666666
2nd row2.3333333333333335
3rd row-4.666666666666666
4th row-2.333333333333333
5th row2.3333333333333335

Common Values

Value               Count  Frequency (%)
4.666666666666667   433    21.4%
-4.666666666666666  426    21.0%
-2.333333333333333  419    20.7%
2.3333333333333335  412    20.3%
0.0                 338    16.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count  Frequency (%)
4.666666666666667   433    21.4%
4.666666666666666   426    21.0%
2.333333333333333   419    20.7%
2.3333333333333335  412    20.3%
0.0                 338    16.7%

Most occurring characters

ValueCountFrequency (%)
312465
40.2%
612452
40.2%
.2028
 
6.5%
4859
 
2.8%
-845
 
2.7%
2831
 
2.7%
0676
 
2.2%
7433
 
1.4%
5412
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number28128
90.7%
Other Punctuation2028
 
6.5%
Dash Punctuation845
 
2.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
312465
44.3%
612452
44.3%
4859
 
3.1%
2831
 
3.0%
0676
 
2.4%
7433
 
1.5%
5412
 
1.5%
Other Punctuation
ValueCountFrequency (%)
.2028
100.0%
Dash Punctuation
ValueCountFrequency (%)
-845
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common31001
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
312465
40.2%
612452
40.2%
.2028
 
6.5%
4859
 
2.8%
-845
 
2.7%
2831
 
2.7%
0676
 
2.2%
7433
 
1.4%
5412
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII31001
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
312465
40.2%
612452
40.2%
.2028
 
6.5%
4859
 
2.8%
-845
 
2.7%
2831
 
2.7%
0676
 
2.2%
7433
 
1.4%
5412
 
1.3%

delta_probability_fatality_moon_trip
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size16.0 KiB
-2.333333333333333
602 
2.3333333333333335
595 
0.0
338 
4.666666666666667
250 
-4.666666666666666
243 

Length

Max length18
Median length18
Mean length15.37672584
Min length3

Characters and Unicode

Total characters31184
Distinct characters9
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.3333333333333335
2nd row-4.666666666666666
3rd row2.3333333333333335
4th row4.666666666666667
5th row-4.666666666666666

Common Values

Value               Count  Frequency (%)
-2.333333333333333  602    29.7%
2.3333333333333335  595    29.3%
0.0                 338    16.7%
4.666666666666667   250    12.3%
-4.666666666666666  243    12.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count  Frequency (%)
2.333333333333333   602    29.7%
2.3333333333333335  595    29.3%
0.0                 338    16.7%
4.666666666666667   250    12.3%
4.666666666666666   243    12.0%

Most occurring characters

ValueCountFrequency (%)
317955
57.6%
67145
 
22.9%
.2028
 
6.5%
21197
 
3.8%
-845
 
2.7%
0676
 
2.2%
5595
 
1.9%
4493
 
1.6%
7250
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number28311
90.8%
Other Punctuation2028
 
6.5%
Dash Punctuation845
 
2.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
317955
63.4%
67145
 
25.2%
21197
 
4.2%
0676
 
2.4%
5595
 
2.1%
4493
 
1.7%
7250
 
0.9%
Other Punctuation
ValueCountFrequency (%)
.2028
100.0%
Dash Punctuation
ValueCountFrequency (%)
-845
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common31184
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
317955
57.6%
67145
 
22.9%
.2028
 
6.5%
21197
 
3.8%
-845
 
2.7%
0676
 
2.2%
5595
 
1.9%
4493
 
1.6%
7250
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII31184
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
317955
57.6%
67145
 
22.9%
.2028
 
6.5%
21197
 
3.8%
-845
 
2.7%
0676
 
2.2%
5595
 
1.9%
4493
 
1.6%
7250
 
0.8%

average_price_dollars
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 87
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65352.42234
Minimum: 150
Maximum: 906250
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.0 KiB

Quantile statistics

Minimum: 150
5-th percentile: 1716.666667
Q1: 14591.66667
Median: 42466.66667
Q3: 86562.5
95-th percentile: 212333.3333
Maximum: 906250
Range: 906100
Interquartile range (IQR): 71970.83333

Descriptive statistics

Standard deviation: 74740.40561
Coefficient of variation (CV): 1.14365165
Kurtosis: 15.97366737
Mean: 65352.42234
Median Absolute Deviation (MAD): 34304.16667
Skewness: 2.919543249
Sum: 132534712.5
Variance: 5586128231
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
58862.5            92     4.5%
21458.33333        86     4.2%
14591.66667        84     4.1%
86562.5            78     3.8%
24237.5            68     3.4%
6008.333333        66     3.3%
41550              60     3.0%
103133.3333        56     2.8%
52204.16667        52     2.6%
65520.83333        52     2.6%
Other values (77)  1334   65.8%

Minimum 10 values

Value        Count  Frequency (%)
150          10     0.5%
375          2      0.1%
525          14     0.7%
900          16     0.8%
933.3333333  10     0.5%
1275         20     1.0%
1716.666667  42     2.1%
1875         14     0.7%
2333.333333  2      0.1%
2625         9      0.4%

Maximum 10 values

Value        Count  Frequency (%)
906250       1      < 0.1%
634375       2      0.1%
606666.6667  1      < 0.1%
453125       4      0.2%
452083.3333  2      0.1%
424666.6667  4      0.2%
385416.6667  2      0.1%
346250       2      0.1%
322916.6667  4      0.2%
317187.5     10     0.5%

delta_price_dollars_orbital
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 110
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 460.3920118
Minimum: -520833.3333
Maximum: 419416.6667
Zeros: 169
Zeros (%): 8.3%
Negative: 1014
Negative (%): 50.0%
Memory size: 16.0 KiB

Quantile statistics

Minimum: -520833.3333
5-th percentile: -130208.3333
Q1: -20970.83333
Median: -391.6666667
Q3: 35950
95-th percentile: 104854.1667
Maximum: 419416.6667
Range: 940250
Interquartile range (IQR): 56920.83333

Descriptive statistics

Standard deviation: 70804.38836
Coefficient of variation (CV): 153.7915223
Kurtosis: 5.438636278
Mean: 460.3920118
Median Absolute Deviation (MAD): 27025
Skewness: -0.3777568901
Sum: 933675
Variance: 5013261411
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   169    8.3%
-9791.666667        72     3.6%
-6658.333333        64     3.2%
50929.16667         52     2.6%
-2741.666667        52     2.6%
74895.83333         50     2.5%
-44270.83333        40     2.0%
20970.83333         40     2.0%
-13708.33333        40     2.0%
-50929.16667        36     1.8%
Other values (100)  1413   69.7%

Minimum 10 values

Value         Count  Frequency (%)
-520833.3333  1      < 0.1%
-392000       1      < 0.1%
-364583.3333  3      0.1%
-299583.3333  1      < 0.1%
-280000       2      0.1%
-260416.6667  6      0.3%
-209708.3333  3      0.1%
-196000       9      0.4%
-182291.6667  2      0.1%
-182291.6667  19     0.9%

Maximum 10 values

Value        Count  Frequency (%)
419416.6667  1      < 0.1%
338750       1      < 0.1%
299583.3333  4      0.2%
260416.6667  1      < 0.1%
237125       3      0.1%
221250       1      < 0.1%
209708.3333  14     0.7%
182291.6667  2      0.1%
169375       6      0.3%
154875       3      0.1%

delta_price_dollars_suborbital
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct116
Distinct (%)5.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-230.1960059
Minimum-599166.6667
Maximum419416.6667
Zeros169
Zeros (%)8.3%
Negative764
Negative (%)37.7%
Memory size16.0 KiB
2023-04-30T11:58:23.776379image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/

Quantile statistics

Minimum-599166.6667
5-th percentile-140000
Q1-20970.83333
median4700
Q337612.5
95-th percentile104854.1667
Maximum419416.6667
Range1018583.333
Interquartile range (IQR)58583.33333

Descriptive statistics

Standard deviation79533.63183
Coefficient of variation (CV)-345.5039609
Kurtosis6.472710744
Mean-230.1960059
Median Absolute Deviation (MAD)31758.33333
Skewness-1.013494704
Sum-466837.5
Variance6325598591
MonotonicityNot monotonic
2023-04-30T11:58:23.922192image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0169
 
8.3%
19583.3333356
 
2.8%
44270.8333352
 
2.6%
65104.1666750
 
2.5%
13316.6666748
 
2.4%
18229.1666740
 
2.0%
5483.33333339
 
1.9%
-149791.666738
 
1.9%
-6658.33333336
 
1.8%
-101858.333336
 
1.8%
Other values (106)1464
72.2%
Value                 Count   Frequency (%)
-599166.6667          1       < 0.1%
-560000               1       < 0.1%
-419416.6667          3       0.1%
-392000               3       0.1%
-299583.3333          7       0.3%
-280000               6       0.3%
-209708.3333          22      1.1%
-196000               19      0.9%
-182291.6667          1       < 0.1%
-149791.6667          38      1.9%
Value                 Count   Frequency (%)
419416.6667           1       < 0.1%
364583.3333           1       < 0.1%
299583.3333           3       0.1%
260416.6667           4       0.2%
237125                1       < 0.1%
221250                1       < 0.1%
209708.3333           11      0.5%
182291.6667           5       0.2%
182291.6667           9       0.4%
169375                2       0.1%

delta_price_dollars_moon_trip
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct          127
Distinct (%)      6.3%
Missing           0
Missing (%)       0.0%
Infinite          0
Infinite (%)      0.0%
Mean              -230.1960059
Minimum           -599166.6667
Maximum           599166.6667
Zeros             169
Zeros (%)         8.3%
Negative          926
Negative (%)      45.7%
Memory size       16.0 KiB

Quantile statistics

Minimum                      -599166.6667
5-th percentile              -140000
Q1                           -20970.83333
Median                       0
Q3                           36458.33333
95-th percentile             118562.5
Maximum                      599166.6667
Range                        1198333.333
Interquartile range (IQR)    57429.16667

Descriptive statistics

Standard deviation                 82803.26818
Coefficient of variation (CV)      -359.7076668
Kurtosis                           7.750530459
Mean                               -230.1960059
Median Absolute Deviation (MAD)    31250
Skewness                           -0.6113856292
Sum                                -466837.5
Variance                           6856381222
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     169     8.3%
-6658.333333          68      3.4%
-9791.666667          68      3.4%
-2741.666667          53      2.6%
-50929.16667          40      2.0%
-13708.33333          39      1.9%
57587.5               36      1.8%
-95200                36      1.8%
44270.83333           36      1.8%
50929.16667           36      1.8%
Other values (117)    1447    71.4%
Value                 Count   Frequency (%)
-599166.6667          1       < 0.1%
-560000               1       < 0.1%
-520833.3333          1       < 0.1%
-419416.6667          2       0.1%
-392000               3       0.1%
-364583.3333          2       0.1%
-299583.3333          4       0.2%
-280000               6       0.3%
-260416.6667          4       0.2%
-209708.3333          12      0.6%
Value                 Count   Frequency (%)
599166.6667           1       < 0.1%
419416.6667           2       0.1%
364583.3333           1       < 0.1%
338750                1       < 0.1%
299583.3333           5       0.2%
260416.6667           3       0.1%
237125                3       0.1%
209708.3333           13      0.6%
182291.6667           3       0.1%
182291.6667           9       0.4%

Interactions

[Figure: 36 pairwise interaction plots]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
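
The calculation described above can be sketched with the standard library alone. This is a minimal illustration, not the report generator's own code: the function names are hypothetical, and tied values are assumed to receive the average of their ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    std_x = covariance(rank_x, rank_x) ** 0.5
    std_y = covariance(rank_y, rank_y) ** 0.5
    return covariance(rank_x, rank_y) / (std_x * std_y)


# A cubic relationship is nonlinear but strictly increasing, so rho is 1 up to rounding.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
print(rho)
```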

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
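
A minimal sketch of this calculation, which also illustrates the invariance under changes in location and scale. The function name and the toy data are assumptions made for illustration and do not come from the dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov_xy / (std_x * std_y)


# Hypothetical, roughly linear toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)
# Shifting and rescaling each variable separately (with positive factors) leaves r unchanged.
r_rescaled = pearson_r([10 * a - 3 for a in x], [0.5 * b + 7 for b in y])
print(r_original, r_rescaled)
```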

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
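
The pair-counting definition above can be sketched directly. This version corresponds to the plain formula in the text (often called tau-a): tied pairs count as neither concordant nor discordant. Statistical libraries commonly apply a tie-adjusted variant instead, so their results may differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, with tied pairs counted as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Of the 6 pairs, 5 are concordant and 1 is discordant (the swapped middle values).
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```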

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
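
A minimal sketch of a bias-corrected Cramér's V along the lines of Bergsma's correction, written with the standard library. It is an illustration under stated assumptions rather than the report generator's implementation: the function name is hypothetical, and the inputs are assumed to be two equally long sequences of category labels with no missing values.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence and shrink the table dimensions.
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    denominator = min(r_corrected - 1, k_corrected - 1)
    if denominator <= 0:
        return 0.0  # a variable with a single category carries no association
    return sqrt(phi2_corrected / denominator)


# Hypothetical labels: the first pair is perfectly associated, the second independent.
v_perfect = cramers_v_corrected(["x", "y"] * 50, ["u", "v"] * 50)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["u", "v", "u", "v"] * 25)
print(v_perfect, v_independent)
```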

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
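
The nullity correlation shown in the heatmap can be sketched as the Pearson correlation between the missingness indicators of two columns. The function name and the toy columns below are hypothetical, and `None` is assumed to mark a missing cell.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing/present indicators of two columns."""
    miss_a = [1.0 if value is None else 0.0 for value in col_a]
    miss_b = [1.0 if value is None else 0.0 for value in col_b]
    n = len(miss_a)
    mean_a, mean_b = sum(miss_a) / n, sum(miss_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(miss_a, miss_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in miss_a) / n
    var_b = sum((b - mean_b) ** 2 for b in miss_b) / n
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no defined correlation.
        return None
    return cov / sqrt(var_a * var_b)


# Hypothetical columns that are missing in exactly the same rows.
income = [50, None, 70, None, 90, 60]
household_income = [80, None, 95, None, 120, 85]
complete_column = [1, 2, 3, 4, 5, 6]

same_pattern = nullity_correlation(income, household_income)
undefined = nullity_correlation(income, complete_column)
print(same_pattern, undefined)
```

A value near +1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and fully observed columns have no defined value.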

Sample

First rows

,df_index,choice,gender,annual_income,household_annual_income,number_vehicles,level_education,work_type,children_home,household_type,status_in_household,type_residence,housing_tenure_type,origin,race,citizenship,risk_activities_sports,price_attribute_suborbital,availability_suborbital,training_suborbital,number_passengers_suborbital,takeoff_location_suborbital,price_attribute_orbital,availability_orbital,training_orbital,number_passengers_orbital,takeoff_location_orbital,price_attribute_moon_trip,availability_moon_trip,training_moon_trip,number_passengers_moon_trip,takeoff_location_moon_trip,age,generation_age,city,state,region,average_probability_fatality,delta_probability_fatality_orbital,delta_probability_fatality_suborbital,delta_probability_fatality_moon_trip,average_price_dollars,delta_price_dollars_orbital,delta_price_dollars_suborbital,delta_price_dollars_moon_trip
0,0,moon_trip,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,362_perc_annual_income,immediate,no,one,usa,362_perc_annual_income,immediate,yes,one,usa,3_perc_annual_income,immediate,no,more_than_one,usa,46,millenials,Rochester,MN,midwest,2.833333,2.333333,-4.666667,2.333333,303333.333333,-149791.666667,-149791.666667,299583.333333
1,1,suborbital,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,362_perc_annual_income,immediate,yes,one,usa,362_perc_annual_income,immediate,yes,more_than_one,other,362_perc_annual_income,in_5_years,no,one,usa,46,millenials,Rochester,MN,midwest,2.833333,2.333333,2.333333,-4.666667,453125.000000,0.000000,0.000000,0.000000
2,2,moon_trip,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,50_perc_annual_income,in_5_years,yes,one,usa,362_perc_annual_income,in_5_years,no,one,other,50_perc_annual_income,immediate,no,more_than_one,other,46,millenials,Rochester,MN,midwest,2.833333,2.333333,-4.666667,2.333333,192708.333333,-260416.666667,130208.333333,130208.333333
3,3,moon_trip,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,3_perc_annual_income,in_5_years,no,more_than_one,usa,50_perc_annual_income,immediate,no,one,other,50_perc_annual_income,in_5_years,yes,one,usa,46,millenials,Rochester,MN,midwest,5.166667,-2.333333,-2.333333,4.666667,42916.666667,-19583.333333,39166.666667,-19583.333333
4,4,suborbital,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,50_perc_annual_income,in_5_years,yes,more_than_one,usa,50_perc_annual_income,immediate,no,more_than_one,other,3_perc_annual_income,immediate,yes,more_than_one,usa,46,millenials,Rochester,MN,midwest,2.833333,2.333333,2.333333,-4.666667,42916.666667,-19583.333333,-19583.333333,39166.666667
5,5,suborbital,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,3_perc_annual_income,in_5_years,no,one,other,50_perc_annual_income,in_5_years,yes,more_than_one,usa,50_perc_annual_income,immediate,no,one,other,46,millenials,Rochester,MN,midwest,2.833333,2.333333,2.333333,-4.666667,42916.666667,-19583.333333,39166.666667,-19583.333333
6,6,suborbital,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,3_perc_annual_income,immediate,yes,one,other,50_perc_annual_income,immediate,no,more_than_one,other,50_perc_annual_income,in_5_years,yes,one,usa,46,millenials,Rochester,MN,midwest,5.166667,-2.333333,4.666667,-2.333333,42916.666667,-19583.333333,39166.666667,-19583.333333
7,7,orbital,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,50_perc_annual_income,immediate,no,more_than_one,other,3_perc_annual_income,immediate,no,more_than_one,usa,362_perc_annual_income,immediate,yes,more_than_one,other,46,millenials,Rochester,MN,midwest,0.500000,0.000000,0.000000,0.000000,173125.000000,169375.000000,110625.000000,-280000.000000
8,8,not_travel,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,50_perc_annual_income,immediate,no,more_than_one,other,50_perc_annual_income,in_5_years,yes,one,usa,362_perc_annual_income,in_5_years,no,one,other,46,millenials,Rochester,MN,midwest,5.166667,-2.333333,-2.333333,4.666667,192708.333333,130208.333333,130208.333333,-260416.666667
9,9,orbital,male,100k_150k,150k_200k,1_car,grad_prof_degree,private,1_child,couple_with_children,head,house,own,non_hispanic,asian,us_citizen,never,362_perc_annual_income,in_5_years,no,more_than_one,other,50_perc_annual_income,in_5_years,yes,one,usa,3_perc_annual_income,in_5_years,yes,more_than_one,other,46,millenials,Rochester,MN,midwest,5.166667,4.666667,-2.333333,-2.333333,173125.000000,110625.000000,-280000.000000,169375.000000

Last rows

,df_index,choice,gender,annual_income,household_annual_income,number_vehicles,level_education,work_type,children_home,household_type,status_in_household,type_residence,housing_tenure_type,origin,race,citizenship,risk_activities_sports,price_attribute_suborbital,availability_suborbital,training_suborbital,number_passengers_suborbital,takeoff_location_suborbital,price_attribute_orbital,availability_orbital,training_orbital,number_passengers_orbital,takeoff_location_orbital,price_attribute_moon_trip,availability_moon_trip,training_moon_trip,number_passengers_moon_trip,takeoff_location_moon_trip,age,generation_age,city,state,region,average_probability_fatality,delta_probability_fatality_orbital,delta_probability_fatality_suborbital,delta_probability_fatality_moon_trip,average_price_dollars,delta_price_dollars_orbital,delta_price_dollars_suborbital,delta_price_dollars_moon_trip
2018,2150,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,50_perc_annual_income,in_5_years,yes,one,usa,362_perc_annual_income,immediate,yes,more_than_one,usa,362_perc_annual_income,immediate,no,one,usa,41,millenials,Palm Springs,CA,west,5.166667,-2.333333,4.666667,-2.333333,226041.666667,-91145.833333,182291.666667,-91145.833333
2019,2151,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,3_perc_annual_income,immediate,no,more_than_one,usa,3_perc_annual_income,immediate,no,one,usa,3_perc_annual_income,in_5_years,no,one,other,41,millenials,Palm Springs,CA,west,0.500000,0.000000,0.000000,0.000000,2625.000000,0.000000,0.000000,0.000000
2020,2152,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,50_perc_annual_income,in_5_years,no,more_than_one,usa,3_perc_annual_income,immediate,yes,more_than_one,usa,362_perc_annual_income,immediate,yes,more_than_one,usa,41,millenials,Palm Springs,CA,west,5.166667,-2.333333,4.666667,-2.333333,121187.500000,118562.500000,77437.500000,-196000.000000
2021,2153,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,3_perc_annual_income,immediate,yes,more_than_one,usa,362_perc_annual_income,in_5_years,yes,more_than_one,other,50_perc_annual_income,in_5_years,yes,more_than_one,usa,41,millenials,Palm Springs,CA,west,2.833333,2.333333,-4.666667,2.333333,121187.500000,-196000.000000,118562.500000,77437.500000
2022,2154,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,362_perc_annual_income,immediate,yes,more_than_one,usa,3_perc_annual_income,in_5_years,no,one,other,362_perc_annual_income,in_5_years,yes,more_than_one,other,41,millenials,Palm Springs,CA,west,2.833333,2.333333,-4.666667,2.333333,212333.333333,209708.333333,-104854.166667,-104854.166667
2023,2155,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,362_perc_annual_income,in_5_years,no,one,other,3_perc_annual_income,in_5_years,yes,more_than_one,other,3_perc_annual_income,in_5_years,yes,one,other,41,millenials,Palm Springs,CA,west,5.166667,-2.333333,4.666667,-2.333333,107479.166667,104854.166667,-209708.333333,104854.166667
2024,2156,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,50_perc_annual_income,immediate,no,one,other,362_perc_annual_income,in_5_years,no,more_than_one,other,50_perc_annual_income,immediate,yes,one,other,41,millenials,Palm Springs,CA,west,7.500000,0.000000,0.000000,0.000000,134895.833333,-182291.666667,91145.833333,91145.833333
2025,2157,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,50_perc_annual_income,immediate,yes,one,other,3_perc_annual_income,immediate,yes,one,other,50_perc_annual_income,in_5_years,no,more_than_one,usa,41,millenials,Palm Springs,CA,west,2.833333,2.333333,-4.666667,2.333333,30041.666667,27416.666667,-13708.333333,-13708.333333
2026,2158,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,3_perc_annual_income,in_5_years,yes,one,other,50_perc_annual_income,immediate,yes,one,other,3_perc_annual_income,immediate,no,one,usa,41,millenials,Palm Springs,CA,west,5.166667,-2.333333,-2.333333,4.666667,16333.333333,-27416.666667,13708.333333,13708.333333
2027,2159,suborbital,male,75k_100k,75k_100k,1_car,grad_prof_degree,private,2_children,couple_with_children,head,house,own,non_hispanic,white,us_citizen,never,362_perc_annual_income,in_5_years,yes,more_than_one,other,362_perc_annual_income,immediate,no,one,usa,50_perc_annual_income,immediate,no,more_than_one,other,41,millenials,Palm Springs,CA,west,5.166667,-2.333333,4.666667,-2.333333,226041.666667,-91145.833333,-91145.833333,182291.666667